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Mechanical unfolding and refolding of ubiquitin are studied by Monte Carlo simulations of a Go model 
with binary variables. The exponential dependence of the time constants on the force is verified, and fold- 
ing/unfolding lengths are computed, with good agreement with experimental results. Furthermore the model 
exhibits intermediate kinetic states, as observed in experiments. Unfolding and refolding pathways and interme- 
diate states, obtained by tracing single secondary structure elements, are consistent with simulations of previous 
all-atom models and with the experimentally observed step sizes. 



Understanding folding and unfolding pathways remains 
one of the major challenges in protein science. Thermal and 
chemical denaturation have been studied for decades, and 
in recent years new experimental techniques, where single 
molecules are manipulated by atomic force microscopy and 
optical tweezers, collectively referred to as force spectroscopy 
III I2I y, bl |6[], have given a new perspective on this prob- 
lem. In typical experiments a protein is pulled by its ends, and 
the unfolding and refolding processes are monitored by mea- 
suring its end-to-end length as a function of time. In force 
clamp experiments |5, 6] the force is kept constant by a feed- 
back system. To unfold a molecule, the force is suddenly in- 
creased from a small to a large value, and the reverse is done to 
let it refold. Typically, unfolding and refolding turn out to be 
two-state processes with a characteristic time which depends 
exponentially on the force according to an Arrhenius-like law, 
but (un)folding intermediates are also observed, as in ubiqui- 
tin. This is a small (76 aminoacids, pdb code lubq), highly 
stable protein which has been the subject of many studies (see 
e.g. |7] andrefs. therein). The native state contains an a-helix 
(ffi: residues 23-34), a short 3io helix {02'- 56-59) and a 5- 
stranded y6-sheet forming tertiary contacts with the a-helix. 
The strands, in the order in which they appear in the sheet, are 
j82 (10-17),/3i (1-7), /35 (64-72), /33 (40-45) and/34 (48-50). 

Ubiquitin unfolding is characterized by a time constant 
which depends exponentially on the force, and by a distance 
from the native to the transition state equal to x„ = 0.17 nm 
fstl • The unfolding transition is signalled by a 20 nm increase 
in the end-to-end length. In a limited number of cases (5% 
of the total) however, a different unfolding pathway was ob- 
served, where the protein unfolds in two steps, characterized 
by 8 and 12 nm length increases respectively. This has been 
attributed to the existence of a partially folded intermediate 
state containing a\, fix and [32 1 5], though in that experiment 
the structure of such state has not been directly probed. 

The refolding also exhibits a rich behaviour f^'], with an 
initial rapid elastic recoil, followed by an intermediate state 
characterized by large length fluctuations, and a final transi- 
tion to the folded state. The folding time turns out to follow 
an Arrhenius-like law, with a distance from the extended to 
the transition state estimated as jc/ = 0.8 nm |0]. 

These experimental results have prompted a series of com- 
putational studies 10, S U. 12, 12] aimed at reproduc- 



ing the general behaviour and elucidating the details of the 
unfolding and refolding pathways and the nature of the in- 
termediate state. Irback and coworkers |@, |^ suggest, on 
the basis of Monte Carlo (MC) simulations of an all-atom 
model, that the most probable unfolding pathway corresponds 
to PiPi — > y8ij62 — » (i-iPi — > PiPA — > a\, i.e., the contacts 
between y6-strands 1 and 5 are the first to yield, followed by 
those between strands 1 and 2, and so on until finally the a- 
helix yields. Furthermore, in the typical unfolding intermedi- 
ate found in that paper, ai, Ps, and ^64 are still folded, in 
marked contrast with jstl • 

Li et al. ifioll verified, using molecular dynamics (MD) 
simulations of a Ca Go model, that unfolding and refold- 
ing times depend exponentially on the force, with jc„ - 0.24 
nm and Xf - 0.96 nm. They did not find an unfolding in- 
termediate, and attributed this to the lack of non-native in- 
teractions. They distinguished three cases, where the force 
is applied to (a) both termini, (b) N-terminus only, and (c) 
C-terminus only. In cases (a) and (c) they obtained that 
the secondary structure elements (SSEs) break in the order 
Px — > j62 — » jSs — > ^63 — > ^64 — > a, while in (b) they found 
l3,^l33^l3A^Px^l32^a. 



Kleiner and Shakhnovich 111 111 found, using MC simulations 
of an all-atom Go model, that unfolding starts with the sepa- 
ration of j6i, j62 and ^65 from the rest of the structure and from 
each other, followed by the separation of ^63 and Pa from ai 
and from each other, and finally by the unfolding of the he- 
lices. Their typical trajectory shows a plateau in the end-to- 
end length, which they associate to an unstable intermediate 
where Pi and P2 are unfolded and Ps is about to unfold. 



Szymczak and Cieplak Ill2l 11311 observed, in MD simula- 
tions of a Ca Go model, that, during folding, the hairpin P1P2 
forms at the beginning if both extremities are left free, while 
it forms at the end if the N-terminus is held fixed. 

Motivated by the discrepancies between these results we 
have studied the mechanical unfolding and refolding of ubiq- 
uitin by means of MC simulations of a simplified G5 model 
with binary variables [14]. We have recently developed this 
model as a generalization of a model originally proposed by 
Wako and Saito \ \3\ in a purely thermodynamic version and 
subsequently reconsidered by Munoz, Eaton and coworkers 
II7II . who used a kinetic version of the model to analyze 
experimental results. In the last few years it has been shown 
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that the model equiHbrium thermodynamics can be solved ex- 
actly 1 18], and that it successfully describes the kinetics of 
protein folding | 2^ 21 , 22 , 23ll. Our generalized model 
for protein mechanical unfolding [ 14], has been shown to ex- 
hibit the typical response of proteins to external loading, al- 
lowing one to estimate the unfolding length of a titin domain, 
in excellent agreement with the experimental value, and with a 
very limited computational effort. In the model a protein made 
up of -H 1 aminoacids is described as a chain of peptide 
bonds: a binary variable is associated to each bond and can 
live in two states (native and unfolded, = 1,0). Given the 
values one can identify native-like stretches (which can be 
as short as a single aminoacid and as long as the whole chain) 
delimited by unfolded bonds. A stretch goes from bond / to j 
if and only if (1 -m,) Y[i<k<j mkil-nij) - 1, and to each stretch 
we associate (i) a native length Ijj taken from the pdb \2^, and 
(ii) an orientational variable cr,-,- = +1, where -i-l(-l) means 



parallel (antiparallel) to the applied force Il25ll . Detailed defi- 
nitions have been given in uM, where the parameter choice is 
also described. The energy scale for the ubiquitin model turns 
out to be e/ks - 156.6K, and the force unit is fixed so as to 
match the experimental characteristic unfolding force /„ ^ 35 
pN Lfij 2M ■ The unfolding and refolding kinetics are studied 
by MC simulations with single-variable-flip Metropolis dy- 
namics: the model time scale fo corresponds to a single MC 
step, the temperature is taken to be T - 300 K. In order to 
monitor the unfolding of SSEs, we define the order param- 
eter for each SSE as the fraction of its peptide bonds in the 
native (folded) state: mj}^ - Zfl,' "^/t^ with s = 1 . . .5, 
and where yS-strand s includes aminoacids from to js, and 
similarly we define m„ , r = 1, 2 for the helices. 

In the present work we consider the protein unfolding in- 
duced by a force clamp |5]. Typically, in such experiment, 
the average unfolding time is given by the Arrhenius' law 
(Tu) = To exp [-/xu/iksT)], where / is the external force and 
Xu the unfolding length. We start from the completely folded 
state, with / = 0, and then we apply a constant force / and 
sample the unfolding time t„ over 1000 trajectories. In fig.[I] 
the mean unfolding time is plotted as a function of the ex- 
ternal force /. The force dependence is clearly exponential 
at small forces and saturates at larger forces, as noticed in 
II12II . From a fit of the data in fig. [T]to the Arrhenius' law 
we find x„ - 1.8 + 0.1 A, in excellent agreement with the 
experimental value found in ISfl. The fit is performed in the 
same range of forces considered in fl, 50 < / < 250 pN. In 
ref. 101, the ubiquitin unfolding process has been shown to de- 
pend on the pulling vector, relative to the structure. Following 
that idea, we applied the force / to the portion of molecule 
spanning from the 48th aminoacid (Lys48) to the C terminus 
(aminoacids 48-76). We found bistability, signaling the un- 
folding transition, at / ~ 100 pN, in reasonable agreement 
with the average unfolding force of 85 pN measured in |2]. 
We then measured (t„) as a function of / (fig. [U. From the 
fit we get x[i = 4.1 + 0.5 A, which is larger than Xu of the 
whole molecule, signalling a softer structure, in agreement 
with ref. This value of x'^ is slightly smaller than that 
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FIG. 1 : Average unfolding time t„ as a function of the force /, ap- 
plied to the whole molecule (□) and to the portion of molecule span- 
ning from the 48th to the 76th aminoacid (O)- Lines are exponential 
fits. Inset: Refolding time as a function of the quenched force /i , and 
with /o = 100 pN. The line is an exponential fit to the data. 
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TABLE I: Probability that the row-index SSE unfolds before the 
column-index one, with / = 100 pN and m,, = 1/3. 



found in (6.3 A), however in that work a different experi- 
mental set up was used: the dynamic loading set-up. We plan 
to address this discrepancy, as well as to give full description 
of the unfolding of the "48-76"-structure, in a future work. 

We probe the unfolding pathway by sampling the order pa- 
rameters of the single SSEs: m^^ and ma,^- We checked that in 
the folded state, these order parameters take the value 1 most 
of the time, with rare fluctuations where they take smaller val- 
ues. Let fg^ (ta,.) be the time at which strand (helix a^) 
unfolds, defined as the time at which the corresponding or- 
der parameter crosses a certain threshold m„ for the first time. 
Then, averaging over 1000 trajectories, we compute the prob- 
ability that, during unfolding at / = 100 pN, a SSE unfolds 
before another one. Results are reported in Table H] and show 
clearly that the typical pathway starts with the simultaneous 
unfolding of /3i and /32, followed by the simultaneous unfold- 
ing of y63, /34 and ^65, and finally by the helices unfolding. In 
Fig. |2] we plot the order parameters m^,, (for hairpin /3i - Pi, 
residues 1-17), m^,^ (residues 40-50) and mp^, in a time in- 
terval containing the unfolding event of a typical trajectory. 
Three regions can be clearly distinguished. The top-left one 
corresponds to the folded state, the rightmost one to the un- 
folded state, and the central one to an intermediate state. In 
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FIG. 2: (color online). Order parameters nip^^ (black), mp^^ (red) and 
mp^ (green) as functions of time, with / = 100 pN, for a typical 
trajectory. Inset: end-to-end length L (in A) of the model molecule 
as a function of time. 



this intermediate state fix and ^62 are unfolded, [i^ and ^64 are 
folded and (3^ fluctuates (as it already did in the folded state). 
The presence of the intermediate state is also signalled by a 
plateau in the end-to-end length, see inset of fig.|2] This state 
appears in all but a few trajectories. Since its lifetime is widely 
varying we argue that when we do not see it in a trajectory, this 
is just due to the time resolution limit. The unfolding pathway 
is then unique, exhibiting an intermediate state with a fluctuat- 
ing lifetime. We have checked that the distribution of the time 
difference tp^^ - f^,-,, which measures the time elapsed between 
the unfolding of the hairpins fix -^2 and ySs -fin, exhibits a sin- 
gle peak (data not shown), while in the case of two different 
pathways one would expect a two-peaked distribution. 

Our pathway is consistent with the all-atom models lIsl fTlll 
and the Ca Go model by Li et al. 1, 10,,] (except for the case 
in which the force is applied to the N-terminus only). In the 
experimental reference [5] the intermediate was attributed to 
the partial unfolding of the strands fix and ^2 and of the a- 
helix, although this conclusion was based only on the length 
of the single strands, their unfolding trajectories being not ex- 
perimentally accessible. On the contrary, in ref. [8J the inter- 
mediate state were identified to be composed by ^83, [3 a and 
ySs and the a-helix, as found in the present work, though ^65 
is fluctuating here. Also in ref. Il 111 it was found that in the 
intermediate state fix and [32 are unfolded, while Ps separates 
from ^63 along the plateau which characterizes the intermedi- 
ate. The apparent discrepancy between theoretical and exper- 
imental scenarios for the intermediate state can be reconciled 
if one considers that the only direct information which comes 
from the experiments is that in the first unfolding step a por- 
tion of the size of 28 aminoacids unfolds [J]. The hairpin 
Px - Pi, plus the loop which connects it to the ff-helix, mea- 
sures 22 aminoacids, and the remaining 6 aminoacids can be 
attributed to the fluctuations of the strand ^65, that we observe 
in the intermediate state. 

We now turn to the analysis of refolding, by considering a 
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TABLE II: Probability that the row-index SSE refolds before the 
column-index one, with /o = 232 pN, /i = 23.2 pN. 



protein which is initially completely unfolded, at equilibrium 
with a large force /q. Then, at f = 0, the stretching force is 
quenched to a low value fx , and the folding trajectory of the 
protein is monitored. In fig. [3] a typical trajectory is shown: 
the order parameter m (fraction of native peptide bonds [14] ) 
and end-to-end length L are plotted as functions of time. It 
is worth noting that, after the quench at f = 0, we observe a 
fast elastic recoil, where the length jumps to a value L ^ 100 
A, while the order parameter m increases more gradually, in- 
dicating that the molecule is not yet structured. After this 
recoil, the length and the order parameter exhibit large fluc- 
tuations. This stage is followed by another abrupt contrac- 
tion in the length, where the protein reaches its equilibrium 
length for the small force applied. This is accompanied by a 
marked increase in the order parameter, which shows that the 
molecule is now fully folded. This behaviour is the same that 
was found by Fernandez and Li |6] in their experimental ob- 
servation of refolding of ubiquitin under force quenching. It is 
worth stressing that in a subset of trajectories, they found that 
the last transition can be split in two stages, however the order 
of refolding of the SSEs during these two stages could not be 
sampled. We find that the refolding pathway is similar to the 
reverse unfolding one, see table|II] The helices and the strands 
^63, are the first SSEs to refold, fix and (32 fold at the final 
stage of the process, while (3^ folds randomly between fi-i - Pa 
and Px -Pi'- thus in addition to the fast elastic recoil the model 
exhibits also a refolding intermediate (see the insets in Fig.O. 
It is tempting to associate this intermediate to the two stages 
observed in ref. |0] in the last transition. Finally, we observe 
that the average refolding time depends also exponentially on 
the force In < t/ >~ fxf, as shown in the inset of Fig. [U and 
the corresponding folding length \s Xf - 6.7 + 0.8 A, in rea- 
sonable agreement with the experimental value xj- - 8 A |@] 
and with the value obtained by Li et al. 1.10] . 

In conclusion, we have shown that a simple Go model with 
binary variables can account for the main features observed 
in the mechanical unfolding and refolding of ubiquitin. This 
model is, to the best of our knowledge, the simplest one with 
sufficient details to allow the study of specific molecules. We 
believe that this model may be a useful tool to investigate the 
interplay between the protein native structures and their un- 
folding and refolding pathways. Finally, we want to stress 
that the ubiquitin refolding cannot be investigated by all-atom 
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FIG. 3: Refolding under constant force: molecular order parameter 
m and length L as functions of the time, /o = 232 pN, /i = 23.2 pN. 



simulations because of the huge computation time required. 
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In our model the native length Ijj is independent of the force. 
This simplification leads to overall lengths slightly smaller than 
the experimental ones, as the extensibility of the native stretches 
is not taken into account. 

Our assumption on orientations corresponds to a reduction in 
configuration space and hence in the entropy of unfolded bonds: 
as a consequence we have to rescale the "natural" force unit 
6/(.lnm) by a factor 5.4, in such a way that the force /„ where 
the molecule elongation is half of its maximum value, corre- 
sponds to the experimental unfolding force /„ = 35 pN @]. 



